Collaborating Authors

 Toronto


Causal Risk Minimization for High-Dimensional Treatments

arXiv.org Machine Learning

Predicting the effect of interventions with many possible variations, e.g., therapeutic content that affects mental health outcomes or an earnings call transcript that drives movement in share price, is useful across several domains. However, classical causal estimators tend to assume that all possible interventions are observed, which is infeasible when interventions vary widely, for instance, in the space of all text strings. We adapt a well-known approach of recasting causal inference as a learning problem, to address high-dimensional treatment spaces. Specifically, under standard assumptions like no unobserved confounding, we show that causal error decomposes into a series of moment-balancing errors of increasing order, and design objectives that directly improve causal estimation. We also show how to project the effect of a high-dimensional treatment onto lower-dimensional treatment attributes, which allows a single model to answer several causal questions without additional attribute-specific training. We empirically evaluate our estimators in settings with high-dimensional continuous, discrete, and text treatments, the last of which used a semi-synthetic dataset of Amazon Reviews. Our experiments demonstrate the benefit of higher-order balance error optimization and competitive performance of projected causal estimates with attribute-specific estimators.


Paul Anka tells Bill Maher crime has gone 'through the roof' in Canada amid recent immigration

FOX News

Paul Anka says Toronto's crime rate has spiked amid the arrival of over 400,000 new immigrants, telling Bill Maher that Canada was homogenous until recently.


SurvivalPFN: Amortizing Survival Prediction via In-Context Bayesian Inference

arXiv.org Machine Learning

Survival analysis provides a powerful statistical framework for modeling time-to-event outcomes in the presence of censoring. However, selecting an appropriate estimator from the many specialized survival approaches often requires substantial methodological and domain expertise. We introduce SurvivalPFN, a prior-data fitted network that amortizes Bayesian inference for censored observations through in-context learning. SurvivalPFN is pretrained on a diverse family of synthetic, identifiable, and right-censored data-generating processes, enabling it to amortize survival analysis in a single forward pass during inference. As a result, the model adapts to the effective complexity of each dataset without task-specific training or hyperparameter tuning, avoids restrictive parametric assumptions, and produces calibrated survival distributions. In a large-scale benchmark spanning 61 datasets, 21 methods, and 5 evaluation metrics, SurvivalPFN achieves strong predictive performance and often improves upon established survival models. These results suggest that SurvivalPFN offers a principled and practical foundation model for survival analysis, with potential applications in high-impact domains such as healthcare, finance, and engineering (https://github.com/rgklab/SurvivalPFN).


The Elon Musk v Sam Altman battle is a distraction | Karen Hao

The Guardian

'If OpenAI lost its footing as the AI industry frontrunner, another barely distinguishable competitor - Musk's xAI or other - would simply replace it.' If it wasn't already clear, Elon Musk and Sam Altman hate each other. While the two men were once cofounders of OpenAI, they're now locked in a vicious feud, playing out in all its theatrics in front of a judge and jury in a California courtroom. Musk is suing, alleging that Altman and OpenAI president Greg Brockman tricked him into forming and funding the organization as a non-profit before they subsequently restructured it to have a for-profit entity.




Learning to Elect

Neural Information Processing Systems

Voting systems have a wide range of applications including recommender systems, web search, product design and elections. Limited by the lack of general-purpose analytical tools, it is difficult to hand-engineer desirable voting rules for each use case. For this reason, it is appealing to automatically discover voting rules geared towards each scenario. In this paper, we show that set-input neural network architectures such as Set Transformers, fully-connected graph networks and DeepSets are both theoretically and empirically well-suited for learning voting rules. In particular, we show that these network models can not only mimic a number of existing voting rules to compelling accuracy -- both position-based (such as Plurality and Borda) and comparison-based (such as Kemeny, Copeland and Maximin) -- but also discover near-optimal voting rules that maximize different social welfare functions. Furthermore, the learned voting rules generalize well to different voter utility distributions and election sizes unseen during training.
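As a point of reference for the position-based rules named above, the following is a minimal sketch of Plurality and Borda in Python. It is not the paper's code. The function names, the example ballots, and the alphabetical tie-breaking are assumptions made for illustration only.

```python
from collections import Counter


def plurality(ballots):
    """Winner is the candidate ranked first on the most ballots.

    Ties are broken alphabetically (an assumption for this sketch).
    """
    firsts = Counter(ballot[0] for ballot in ballots)
    best = max(firsts.values())
    return min(c for c, n in firsts.items() if n == best)


def borda(ballots):
    """Winner has the highest Borda score.

    With m candidates, a candidate in position p (0-based) on a ballot
    earns m - 1 - p points. Ties are broken alphabetically (assumption).
    """
    m = len(ballots[0])
    scores = Counter()
    for ballot in ballots:
        for pos, candidate in enumerate(ballot):
            scores[candidate] += m - 1 - pos
    best = max(scores.values())
    return min(c for c, s in scores.items() if s == best)


# Hypothetical election: 7 voters ranking candidates a, b, c.
ballots = (
    [["a", "b", "c"]] * 3
    + [["b", "c", "a"]] * 2
    + [["c", "b", "a"]] * 2
)

print(plurality(ballots))  # a has the most first-place votes
print(borda(ballots))      # b has the highest total score
```

On these ballots the two rules pick different winners, which is one reason a single hand-engineered rule may not suit every use case.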



Appendices

Neural Information Processing Systems

A Additional Experiments

Task 1 - Grouping: In addition to grouping clue words using token embeddings (discussed in Section 4 of the main paper), we also grouped the words by clustering on 'contextual' embeddings. We experimentally induce 'context' by joining the sixteen (16) word tokens (in a random order) into a single pseudo-sentence. The embedding of each token therefore differs depending on the ordering of the tokens. We repeat the random ordering sixteen times and report the mean and variance of the results obtained in Table 6. Mean ± standard deviation over 16 random seeds is shown.

Task 2 - Connections: In addition to prompting-based results on GPT-4 (discussed in Section 4), we ran experiments on additional LLMs like LLaMa [67] (7B, 13B) using pre-trained configuration weights obtained by permission from Meta AI. However, without additional fine-tuning on the specific task, these LLMs were unable to solve the task in a meaningful manner.
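The random-ordering step described for Task 1 can be sketched as follows. This is only an illustration of building the pseudo-sentences, not the authors' code. The function name `pseudo_sentences`, the seed handling, and the example words are hypothetical, and the embedding and clustering steps are left out.

```python
import random


def pseudo_sentences(words, n_orderings=16, seed=0):
    """Join the words into n_orderings randomly ordered pseudo-sentences.

    Each pseudo-sentence contains every word exactly once. Only the order
    changes, which is what makes a contextual embedding of a given token
    vary from one ordering to the next.
    """
    rng = random.Random(seed)  # seeded for reproducibility (assumption)
    sentences = []
    for _ in range(n_orderings):
        order = list(words)
        rng.shuffle(order)
        sentences.append(" ".join(order))
    return sentences


# Sixteen hypothetical clue words, each a single token.
words = [
    "apple", "river", "chair", "cloud", "piano", "tiger", "bread", "stone",
    "green", "ocean", "clock", "train", "lemon", "spoon", "horse", "candle",
]
sentences = pseudo_sentences(words)
print(sentences[0])
```

Each pseudo-sentence would then be passed to a contextual encoder, and the per-token embeddings clustered once per ordering before the results are aggregated.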